Time Spent Outdoors Partly Accounts for the Effect of Education on Myopia

Purpose
The purpose of this study was to investigate if education contributes to the risk of myopia because educational activities typically occur indoors or because of other factors, such as prolonged near viewing.

Methods
This was a two-sample Mendelian randomization study. Participants were from the UK Biobank, Avon Longitudinal Study of Parents and Children, and Generation R. Genetic variants associated with years spent in education or time spent outdoors were used as instrumental variables. The main outcome measures were: (1) spherical equivalent refractive error attained by adulthood, and (2) risk of an early age-of-onset of spectacle wear (EAOSW), defined as an age-of-onset of 15 years or below.

Results
Time spent outdoors was found to have a small genetic component (heritability 9.8%) that tracked from childhood to adulthood. A polygenic score for time outdoors was associated with children's time outdoors; a polygenic score for years spent in education was inversely associated with children's time outdoors. Accounting for the relationship between time spent outdoors and myopia in a multivariable Mendelian randomization analysis reduced the size of the causal effect of more years in education on myopia to −0.17 diopters (D) per additional year of formal education (95% confidence interval [CI] = −0.32 to −0.01) compared with the estimate from a univariable Mendelian randomization analysis of −0.27 D per year (95% CI = −0.41 to −0.13). Comparable results were obtained for the outcome EAOSW.

Conclusions
Accounting for the effects of time outdoors reduced the estimated causal effect of education on myopia by 40%. These results suggest about half of the relationship between education and myopia may be mediated by children not being outdoors during schooling.

Participants were classified as having European ancestry if their first two genetic principal components (PCs) [data field #22009] were within the mean ± 10 standard deviations of all unrelated UK Biobank participants who self-reported their ethnicity as White British [46] and if they had genotype heterozygosity [data field #22004] also within the mean ± 10 standard deviations of all unrelated UK Biobank participants who self-reported their ethnicity as White British. We selected participants of European ancestry who had information available for time spent outdoors in summer [data field #1050], place-of-birth northing coordinate [data field #129], place-of-birth easting coordinate [data field #130] and Assessment Centre [data field #54]. Participants were filtered to exclude those related to a sibling and to exclude those who were included in the GWAS for spherical equivalent refractive error (see Note S2).
This resulted in a sample of 280 891 individuals. Participants' age at the baseline assessment visit was calculated from their date of visit [data field #53] and their year of birth [data field #34] and month of birth [data field #52].
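The ancestry filter described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the use of the sample standard deviation (rather than the population standard deviation) is an assumption.

```python
from statistics import mean, stdev


def within_k_sd(value, reference, k=10.0):
    """Return True if `value` lies within mean ± k SD of `reference`.

    `reference` stands in for the values observed in unrelated,
    self-reported White British participants (assumed input).
    """
    m, s = mean(reference), stdev(reference)  # sample SD: an assumption
    return abs(value - m) <= k * s


def is_european(pc1, pc2, het, ref_pc1, ref_pc2, ref_het, k=10.0):
    """Apply the same ± k SD rule to PC1, PC2 and heterozygosity."""
    return (
        within_k_sd(pc1, ref_pc1, k)
        and within_k_sd(pc2, ref_pc2, k)
        and within_k_sd(het, ref_het, k)
    )
```

A participant is retained only if all three checks pass, which mirrors the "and" in the classification rule.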
A GWAS for time outdoors in summer was performed with BOLT [47] Centre. This resulted in a sample of 72 576 individuals. The refractive error of each participant was taken as the spherical equivalent refractive error (sphere plus 0.5 × cylinder) averaged between the two eyes [48][49][50].
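The spherical equivalent calculation is simple arithmetic and can be sketched as below. The function names and the tuple layout are illustrative assumptions.

```python
def spherical_equivalent(sphere, cylinder):
    """Spherical equivalent in diopters: sphere plus half the cylinder."""
    return sphere + 0.5 * cylinder


def mean_spherical_equivalent(right_eye, left_eye):
    """Average the spherical equivalent of the two eyes.

    Each eye is given as a (sphere, cylinder) tuple in diopters
    (a hypothetical input layout).
    """
    return (spherical_equivalent(*right_eye) + spherical_equivalent(*left_eye)) / 2.0
```

For example, a right eye of −2.00 D sphere with −1.00 D cylinder and a left eye of −1.00 D sphere with −0.50 D cylinder give spherical equivalents of −2.50 D and −1.25 D, and an average of −1.875 D.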

Note S3. GWAS for EAOSW in adults from UK Biobank
The GWAS for EAOSW was performed in the same sample of 72 576 individuals as the GWAS for spherical equivalent refractive error (Note S2). Participants were classified as having an early age-of-onset of spectacle wear [data field #2217] if they reported first wearing glasses or contact lenses at or before the age of 15 years old (EAOSW = 1). All other participants were classified as not having an early age-of-onset of spectacle wear (EAOSW = 0), including participants who did not answer this question (for example, because they never wore glasses or contact lenses).
A GWAS for the binary outcome EAOSW was performed with the glm function in R. Age, Age-squared, sex, northing coordinate, easting coordinate, genotyping array, the first 10 genetic ancestry PCs, and Assessment Centre (one-hot encoded) were included as covariates.
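The case definition for EAOSW can be written as a small coding function. This is a sketch only: representing an unanswered question as `None` is an assumption, and the function name is hypothetical.

```python
def code_eaosw(age_first_wore_spectacles):
    """Code the binary EAOSW outcome.

    Returns 1 if glasses or contact lenses were first worn at or before
    age 15, otherwise 0. A missing answer (None, an assumed encoding)
    is coded 0, following the rule that non-responders are treated as
    not having an early age-of-onset of spectacle wear.
    """
    if age_first_wore_spectacles is None:
        return 0
    return 1 if age_first_wore_spectacles <= 15 else 0
```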

Note S4. Derivation of polygenic scores for EduYears and Time Outdoors
(a) The polygenic score for time outdoors was derived by repeating the GWAS for time outdoors in summer (as described in Note S1) except that (i) only variants in the HapMap3 set (file 'map.rds' available at: https://doi.org/10.6084/m9.figshare.19213299) were included and (ii) the BOLT option --predBetasFile was selected to assume an infinitesimal model.
(b) The polygenic score for EduYears was derived using the summary statistics from the GWAS for EduYears reported by the Within Family GWAS Consortium [51] (dataset ieu-b-4836). The GWAS summary statistics were filtered to exclude variants not present in the HapMap3 set (file 'map.rds' above). Next, LDpred2 was used to account for linkage disequilibrium between variants, with default settings for an infinitesimal model (function snp_ldpred2_inf from the R package bigsnpr) [52].

Note S5. ALSPAC/Generation R questionnaire response coding and polygenic score analyses
In the ALSPAC study, questionnaire items relating to time spent outdoors followed the format, "How much time, on average, on a typical [school weekday / weekend day / school holiday] does your child spend out of doors in [summer / winter]." Items relating to time spent reading for pleasure followed the same format. The questionnaire response options were: None; Less than one hour; One to two hours; Three or more hours. These responses were converted from an ordinal to a pseudo-continuous scale by assigning values of 0, 0.5, 1.5 and 3 hours, respectively, to the above responses. The average time spent per day was calculated as: (hours on a weekday × 5/7) + (hours on a weekend day × 2/7).
Assuming that summer comprised 13 weeks (6 weeks of holidays and 7 weeks non-holidays) and that winter comprised 13 weeks (3 weeks of holidays and 10 weeks non-holidays), the average time spent per day across the whole year was calculated as: (hours on a summer non-holiday × 7/26) + (hours on a summer holiday × 6/26) + (hours on a winter non-holiday × 10/26) + (hours on a winter holiday × 3/26).
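The ALSPAC response coding and the two weighted averages above can be sketched as follows. The function names are hypothetical, and it is an assumption that each argument to the yearly average is already an average number of hours per day for that period.

```python
# Pseudo-continuous values (hours) assigned to the ordinal responses.
RESPONSE_HOURS = {
    "None": 0.0,
    "Less than one hour": 0.5,
    "One to two hours": 1.5,
    "Three or more hours": 3.0,
}


def daily_average(weekday_response, weekend_response):
    """Average hours per day: 5 weekdays and 2 weekend days per week."""
    weekday = RESPONSE_HOURS[weekday_response]
    weekend = RESPONSE_HOURS[weekend_response]
    return weekday * 5 / 7 + weekend * 2 / 7


def yearly_average(summer_term, summer_holiday, winter_term, winter_holiday):
    """Average hours per day across the year, weighted by weeks out of 26.

    Weights follow the stated assumption of 7 + 6 summer weeks and
    10 + 3 winter weeks.
    """
    return (
        summer_term * 7
        + summer_holiday * 6
        + winter_term * 10
        + winter_holiday * 3
    ) / 26
```

For instance, "One to two hours" on weekdays and "Three or more hours" on weekend days give (1.5 × 5/7) + (3 × 2/7) ≈ 1.93 hours per day.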
In the Generation R study, questionnaire items relating to time outdoors and near work typically had the following format: "How much time do you [does your child], on average, spend outside [reading] during a school weekday [weekend day]." The response options were: None, 0-30 minutes, 30-60 minutes, 1-2 hours, 3-4 hours, more than 4 hours. These responses were converted to a pseudo-continuous scale using the same approach as above.
Analyses of polygenic score associations were carried out separately in the ALSPAC and Generation R cohorts. The polygenic score of each participant in the cohort was calculated using the --score function of PLINK [53] version 1.9, and was standardised to have a mean of zero and a standard deviation of one. The polygenic score for time outdoors was calculated using the beta coefficients described in Note S4(a), while the polygenic score for EduYears was calculated using the beta coefficients described in Note S4(b).
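The scoring and standardisation step can be illustrated with a short sketch. It is not a reimplementation of PLINK: the score here is a plain weighted sum of allele dosages, the function names are hypothetical, and the use of the sample standard deviation is an assumption. Any constant rescaling of the raw score has no effect on the standardised values.

```python
from statistics import mean, stdev


def raw_polygenic_scores(dosages, betas):
    """Weighted sum of allele dosages for each participant.

    dosages: one list of per-variant allele counts per participant.
    betas:   per-variant weights, in the same variant order.
    """
    return [sum(b * d for b, d in zip(betas, person)) for person in dosages]


def standardise(scores):
    """Rescale scores to mean zero and standard deviation one."""
    m, s = mean(scores), stdev(scores)  # sample SD: an assumption
    return [(x - m) / s for x in scores]
```

With this scaling, a regression coefficient for the score is interpreted per one standard deviation increase in the polygenic score.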
The following two regression equations were fitted for each of the time outdoors or time reading variables. All participants of European ancestry with genotype data and questionnaire response data were included in the analysis: =   +  + 1 + 2 … + 10 (eq. 1) where,   is the standardized polygenic score for time outdoors, and   is the standardized polygenic score for EduYears.

Note S6. Technical details of Mendelian randomisation analyses
Steiger's test [54] was used to identify variants more strongly associated with the outcome than with the exposure; no variant failed the test (Steiger's test, P > 0.05). The correlation between IVs was computed using the command --r in PLINK [53] version 1.9. IVs with a squared correlation > 0.1 not already removed by the clumping procedure (due to being separated by more than 1000 kb or due to being closely-situated IVs for the two traits in an MVMR analysis) were pruned by removing the less strongly associated variant of each pair. MR analyses were carried out for the continuous outcome spherical equivalent refractive error or the binary outcome EAOSW.
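The pruning rule for correlated IVs can be sketched as below. Several details are assumptions made for illustration: strength of association is represented by the exposure p-value, pairs are visited from most to least correlated, and the function and variable names are hypothetical.

```python
def prune_correlated(pvalues, r2_pairs, threshold=0.1):
    """Drop the less strongly associated variant of each correlated pair.

    pvalues:  {variant_id: exposure association p-value}
    r2_pairs: {(variant_a, variant_b): squared correlation}
    Returns the retained variant ids in their original order.
    """
    removed = set()
    # Visit the most strongly correlated pairs first (an assumption).
    for (a, b), r2 in sorted(r2_pairs.items(), key=lambda item: -item[1]):
        if r2 <= threshold or a in removed or b in removed:
            continue
        # The larger p-value marks the weaker association.
        removed.add(a if pvalues[a] > pvalues[b] else b)
    return [v for v in pvalues if v not in removed]
```

In a multivariable analysis each variant has an association with both exposures, so a real implementation would need an explicit rule for which p-value to compare.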
Univariable MR analysis methods included MR-IVW, Egger (MR-EGGER), weighted median (MR-MEDIAN) and mode-based (MR-MBE) analyses, which were performed using the mr_ivw, mr_egger, mr_median and mr_mbe functions, respectively, from the R package MendelianRandomization [55]. The MR-PRESSO analysis was performed using the mr_presso function from the R package MRPRESSO [56]. A robust (MR-ROBUST) analysis was performed with the mvmr_robust function from the R package robust-mvmr [57].
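To make the inverse-variance weighted estimator concrete, the sketch below computes a fixed-effect IVW estimate from per-variant summary statistics. It is an illustration of the general technique with first-order weights, not a reproduction of the mr_ivw function, whose defaults (for example, a random-effects model) may differ. The function name and inputs are hypothetical.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted causal estimate.

    Each variant contributes a ratio estimate beta_outcome / beta_exposure,
    weighted by beta_exposure**2 / se_outcome**2 (first-order weights).
    Returns (estimate, standard error).
    """
    weights = [bx * bx / (se * se) for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    estimate = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    standard_error = math.sqrt(1.0 / sum(weights))
    return estimate, standard_error
```

If every variant implied the same ratio, say −0.2 D per year of education, the IVW estimate would equal that ratio exactly; disagreement between variants is what the heterogeneity statistics quantify.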
Multivariable MR analyses were carried out using the inverse variance weighting (MVMR-IVW), Egger (MVMR-EGGER) and weighted median (MVMR-MEDIAN) methods, performed with the mr_mvivw, mr_mvegger and mvmr_median functions, respectively, from the R package MendelianRandomization [55]. An MVMR mode-based analysis was performed with the mv_mrmode function from the R package MVMRmode [58]. An MVMR-PRESSO analysis was performed after removing the outlier IVs detected in the two univariable MR-PRESSO analyses [56]. A robust (MVMR-ROBUST) analysis was performed with the mvmr_robust function from the R package robust-mvmr [57].

Note S7. Results of sensitivity analyses
The results from the sensitivity analyses are shown in Supplementary Tables S4 and S5 and Supplementary Figures S2-S6. An F-statistic above 10 is indicative of robustness against weak instrument bias in multivariable Mendelian randomisation analyses [59]. The F-statistics from the inverse-variance weighted multivariable analyses were F = 12.1 for the years spent in education IVs and F = 19.6 for the time outdoors IVs, suggesting they were robust to weak instrument bias (in the analyses using spherical equivalent refractive error as the outcome and in the analyses using the risk of EAOSW as the outcome). There was strong evidence in the univariable and multivariable analyses of heterogeneity in SNP-exposure vs. SNP-outcome effect sizes: Cochran's Q-statistic was Q = 114.6, P = 1.40e-07, for the inverse-variance weighted multivariable analysis of spherical equivalent refractive error, and Q = 83.5, P = 8.30e-04, for the equivalent analysis of EAOSW. Such heterogeneity can be indicative of horizontal pleiotropy. However, repeating the univariable and multivariable analyses using a series of Mendelian randomisation methods designed to provide valid causal effect estimates in the presence of horizontal pleiotropy yielded highly comparable results to those from the inverse-variance weighted method. The confidence intervals were wide for the MR-Egger and MR-MBE methods (Tables 1 and 2), yet the causal effect estimates from these methods overlapped those from the other analysis methods. There was no indication of directional horizontal pleiotropy: MVMR-Egger intercept = 0.002 D, P = 0.83, for the outcome spherical equivalent refractive error, and MVMR-Egger intercept = -0.005, P = 0.54, for the outcome EAOSW. Increasing the number of IVs for the two exposures by relaxing the GWAS p-value threshold for selecting SNPs from P < 5e-08 to either P < 1e-07 or P < 1e-06 yielded consistent causal effect estimates. Previous work failed to find a significant effect of refractive error on years spent in education [60,61], and here, a univariable Mendelian randomisation analysis provided little evidence that refractive error influenced the time children spent outdoors (Supplementary Table S6).
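The heterogeneity statistic discussed above can be illustrated for the simpler univariable case. The sketch below computes Cochran's Q around a fixed-effect IVW estimate using first-order weights; the values reported in this note come from multivariable analyses, so this is an analogue rather than the calculation actually used. The function name and inputs are hypothetical.

```python
def cochran_q(beta_exposure, beta_outcome, se_outcome):
    """Cochran's Q for a univariable IVW analysis.

    Q is the weighted sum of squared deviations of each variant's ratio
    estimate from the pooled IVW estimate. Under homogeneity it is
    compared against a chi-squared distribution with (number of
    variants - 1) degrees of freedom.
    """
    weights = [bx * bx / (se * se) for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    pooled = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    return sum(w * (r - pooled) ** 2 for w, r in zip(weights, ratios))
```

A Q value well above its degrees of freedom indicates that the variants disagree more than sampling error alone would explain, which is why pleiotropy-robust methods were run alongside the IVW analysis.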
Values are median (interquartile range) unless otherwise specified. Note that refractive error was not measured in the majority of the GWAS for time outdoors sample, as autorefraction was only introduced towards the end of the UK Biobank recruitment period.


Table S2. Association of polygenic scores for time outdoors or EduYears with the time ALSPAC participants spent outdoors or reading.
(These data are plotted in Figure 1).

Table S3. Association of polygenic scores for time outdoors or EduYears with the time Generation R participants spent outdoors or on near work.
(Some of these data are plotted in Figure 1).
for 9 572 557 imputed SNPs with minor allele frequency (MAF) ≥ 0.01, imputation quality metric (INFO) ≥ 0.8 and per variant genotyping call rate ≥ 0.95. Age, Age-squared, sex [data field #22001], northing coordinate, easting coordinate, genotyping array [data field #22000], the first 10 genetic ancestry PCs, and Assessment Centre (one-hot encoded) were included as covariates.

Note S2. GWAS for spherical equivalent refractive error in adults from UK Biobank
Calculation of participants' age at the baseline assessment visit and classification of individuals of European ancestry are described in Note S1. We selected unrelated participants [data field #22011] of European ancestry who had information available for non-cycloplegic autorefraction [data fields #5084-5088], place-of-birth northing coordinate, place-of-birth easting coordinate and Assessment

Table S4. Full MR and MVMR results for outcome: Spherical equivalent refractive error.
Results are expressed in units of diopters.